PART 1 : TECHNICAL REPORT

A  large  number  of  digital  image  processing  programs  were  developed  on 
our  Grinnell  Image  Processing  System  in  order  to  facilitate  the  research 
outlined  in  the  grant.  We  have  the  capability  of  instituting  most  of  the 
standard  image  processing  techniques  such  as  luminance  histogram  analysis, 
luminance  and  contrast  modification,  high  and  low  pass  filtering  etc.  We  also 
developed the software capable of doing the following to images or parts of
images: selectively inducing various degrees of blur, inducing different
degrees of binocular disparity with the anaglyphic method, interactively
controlling the disparity in images in real time, adding selected portions of
different images together, interactively controlling the contrast of an image
in real time, and flickering selected portions of images at different temporal
frequencies.

During the first year of the grant we had several technical problems
which were solved by equipment replacement. The Winchester drive on the
computer hosting the Grinnell crashed severely enough to warrant the purchase
of a CDC drive as replacement. The Grinnell also had some power supply
problems that were corrected. The GT-40 system had been most reliable until
this past year. A problem with ghost images on the monitor screen has been
attributed to an unknown system problem emanating from within the PDP 11/34.
The 11/34 was an upgrade from the 11/03 necessitated by the failure of the
Plessey memory boards associated with the 11/05. An RX01 floppy drive has been
added to this system, allowing communication between the 11/23 (the Grinnell
host) and the 11/34 (the GT-40 host).

PART  2  :  SCIENTIFIC  PROGRESS  REPORT 


The  initial  three  years  on  this  research  project  have  been  highly 
successful.  Our  pre-proposal  findings  had  shown  that  sharp  line  segments  are 
detected  and  discriminated  better  in  figure  regions,  while  blurred  line 
segments  are  detected  but  not  discriminated  better  in  those  same  regions  when 
seen  as  ground.  Also,  we  had  some  data  showing  that  identification/detection 
ratios  were  larger  for  targets  1.6  degrees  apart  in  orientation  when  these 
targets  were  seen  against  figure  versus  ground  regions.  Finally,  we  had 
preliminary  data  showing  that  flickering  regions  were  predominantly  perceived 
as  ground  behind  nonflickering  regions  which  were  perceived  as  figure. 


On the basis of these data and on our organizing hypothesis discussed
in this report, that figure perception involves slow, high spatial frequency
analysis of details while ground perception involves the fast processing of
low spatial frequency, more global aspects of an image, we made the following
predictions:

1)  Contrast  sensitivities  for  sinewave  gratings  imaged  in  backgrounds  would 
be  greater  at  low  spatial  frequencies;  those  for  sinewave  gratings  imaged  in 
figure  would  be  greater  at  high  spatial  frequencies. 

2) The orientation bandwidths for tilted targets would be narrower in figure
than in ground regions.

3)  The  time-course  of  visual  signals  would  be  faster  from  ground  than  from 
figure  regions. 

4) High spatial-frequency-filled regions would appear as figure in relation
to low spatial-frequency-filled regions, which would appear as ground.

5) High temporal-frequency-filled regions would appear as backgrounds in
relation to low temporal-frequency-filled regions, which would appear as
figure.

6) Regions moving at high velocities would appear as backgrounds in relation
to non-moving or slow-moving regions, which would appear as figure.

Qualitatively, our findings corroborated these predictions. However,
our findings have provided some interesting amplifications of our organizing
hypothesis. First, spatial frequencies may be coded relatively rather than
absolutely in figure-ground perception. Second, although the data are still
unclear, it appears that both relative and absolute flicker frequencies are
used in determining figure-ground perception. Third, our findings suggest
that temporal frequency may be a more powerful cue to figure and ground than
spatial frequency. Fourth, some preliminary findings on the effects of
perceived velocity on figure-ground perception suggest that velocity and
temporal frequency have different effects on the perception of figure and
ground. Fifth, perceived depth accompanying the spatially and temporally
induced figure-ground effects is crucial in maintaining the perception that a
region is figure or ground. Sixth, images filled with regions of different
spatial or temporal frequency yield multi-stable percepts. Regions filled
with higher spatial or lower temporal frequencies are perceived more often as
figure than background, but they are perceived as background some of the
time. Interestingly, binocular disparities also yield multi-stable percepts
with our displays. This suggests that instability of figure and ground may be
common when only one or another determinant of these perceptions is operating.
That is, the question we should be asking of all figure-ground perceptions
is "why are they stable?" rather than "why are such and such isolated
configurations unstable?"

Our  findings  are  summarized  below.  Also  summarized  at  the  end  of  the 
report  are  studies  of  other  visual  phenomena  that  we  have  been  working  on. 
Reprints  and  preprints  of  work  performed  during  the  granting  years  are 
included  in  an  appendix. 

A. The effects of figure-ground perception on the spatial, temporal, and
orientation response of the visual system

1. The contrast sensitivity response in figure versus ground (Wong and
Weisstein, 1987)


Having  found  evidence  that  the  spatial  frequency  response  (thresholds 
for  sharp  versus  blurred  lines)  differs  in  figure  versus  ground  regions  (Wong 
and  Weisstein,  1983)  we  investigated  precisely  how  this  spatial  response 
differed.  We  studied  the  contrast  sensitivity  response  in  figure  and  ground 
regions. The figure-ground context was the Rubin faces-vase ambiguous picture
(Figure  1)  covering  an  area  of  6.4  by  6.4  degrees. 


Figure 1 : The Rubin faces-vase ambiguous picture. The sinewave grating patch
was presented in the middle of the picture at the location of the fixation
stimulus.

The target was a spatially and temporally Gaussian-modulated sinewave
patch  1.75  by  1.75  degrees  and  was  presented  at  the  fixation  point  in  the 
middle  of  the  picture.  The  mean  luminance  of  the  grating  and  the  luminance  of 
the background was 20 cd/m2. The observer's task was to adjust the contrast
of the grating until it was just visible. The spatial frequency of the target
ranged from 0.5 cpd to 16 cpd in half-octave steps. In one block of trials,
the  observer  made  the  adjustment  only  when  the  central  region  was  perceived 
as  figure;  in  another  block  of  trials,  the  observer  made  the  contrast 
adjustment  only  when  the  central  region  was  perceived  as  ground.  (We  did  not 
present  the  target  peripherally  as  we  have  already  established  that  eye 
movements  were  negligible  under  our  fixation  conditions  and  did  not  account 
for our effects [Wong and Weisstein, 1982, 1983; see also our metacontrast
studies below]. However, for thoroughness, we propose some peripheral target
conditions in the continuation request.) We found what we predicted: The
sensitivities  for  gratings  imaged  in  a  figure  region  were  shifted  toward  the 
higher  spatial  frequencies  while  those  for  gratings  imaged  in  a  ground  region 
were  shifted  to  the  lower  spatial  frequencies.  Figure  2a  and  2b  show  the  CSF 
from  two  observers. 


Figure 2a and 2b : Contrast sensitivity functions for gratings imaged in
figure and ground regions. Data are from two observers. Each data point is
based on the mean of ten adjustments.
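
The half-octave spacing of the target spatial frequencies described above can be made concrete with a short sketch. It is an illustration only: the function name and its parameters are hypothetical and do not come from the software used in the study. It simply lists the frequencies obtained by stepping from 0.5 cpd to 16 cpd in equal logarithmic steps of half an octave.

```python
def half_octave_series(f_min=0.5, f_max=16.0, steps_per_octave=2):
    """Return spatial frequencies (cpd) from f_min to f_max in equal log2 steps.

    Hypothetical helper: two steps per octave gives half-octave spacing.
    """
    freqs = []
    k = 0
    while True:
        f = f_min * 2 ** (k / steps_per_octave)
        if f > f_max * (1 + 1e-9):  # small tolerance for floating-point error
            break
        freqs.append(round(f, 3))
        k += 1
    return freqs


frequencies = half_octave_series()
for f in frequencies:
    print(f"{f} cpd")
```

Under these assumptions the range spans five octaves, so the series contains eleven test frequencies.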

The  functions  we  obtained,  however,  did  not  have  very  prominent  peaks, 
and  there  was  a  good  deal  of  response  variability.  One  of  the  reasons  why  we 
were  unable  to  obtain  more  definitive  CSFs  might  be  our  psychophysical 
method.  We  used  the  method  of  adjustment  in  which  the  observer  adjusted  the 
contrast  of  the  grating  until  it  was  visible.  Under  this  condition  it  might 
be  difficult  for  the  observer  to  make  the  adjustment  while  monitoring  which 
region  was  perceived  as  figure  or  ground.  Moreover,  the  alternating 
perceptions  of  a  region  as  figure  or  ground  may  fluctuate  too  rapidly  for  the 
adjustment  to  be  made  within  the  period  of  time  a  region  was  seen  totally  as 
figure  or  as  ground.  These  problems  prompt  us  to  re-run  the  study  using  the 
temporal  two-alternative  forced-choice  method  in  obtaining  the  contrast 
response.  This  method  has  the  advantage  of  constraining  the  response  to 
within  the  time  period  that  a  region  is  perceived  as  figure  or  ground.  It 
also reduces demands on the observer by simplifying the task (see section D,
Methods of Procedure, in the renewal request).

2. The orientation response to figure and ground (Weisstein and Wong, 1986)

Our  organizing  hypothesis  predicts  that  resolution  of  details  such  as 
discrimination  of  the  orientation  of  tilted  lines  will  be  enhanced  in  figure 
regions.  Early  findings  (Weisstein  and  Wong,  1986)  suggested  that  the 
facilitative  effects  of  figure  on  orientation  discrimination  might  be  due  to 
narrowing  of  orientation  bandwidths  in  figure  regions.  Preliminary  data 
showed  that  targets  1.6  degrees  apart  in  orientation  are  detected  and 
discriminated  equally  well  in  figure  regions  but  detection  is  markedly  better 
than discrimination for the same pair of targets presented in ground. Using
the assumptions and methods of the identification/detection ratio (Thomas and
Gille, 1979; also see below) we examined the identification/detection ratios
of targets whose orientations differed by 3.2, 9.9, 9.8, 13.0, 19.5, 25.8,
and 31.8 degrees. The temporal two-alternative forced-choice method was used
to collect data. Each orientation difference yielded discrimination
thresholds at .5, .6, .7, and .8 detection thresholds. From this we estimated
a slope (the mean discrimination/detection ratio) for each orientation
difference.
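
One possible reading of the slope estimate described above is sketched below. The report does not give the exact computation, so both functions are assumptions: the first takes the mean of the paired discrimination/detection ratios, the second fits a least-squares line through the origin. The numbers are invented for illustration and are not data from the study.

```python
def mean_ratio(detection, discrimination):
    """Mean of the paired discrimination/detection ratios (assumed estimator)."""
    ratios = [i / d for d, i in zip(detection, discrimination)]
    return sum(ratios) / len(ratios)


def slope_through_origin(detection, discrimination):
    """Least-squares slope of discrimination on detection, forced through zero."""
    num = sum(d * i for d, i in zip(detection, discrimination))
    den = sum(d * d for d in detection)
    return num / den


# Hypothetical values at the four detection levels named in the text.
hypothetical_detection = [0.5, 0.6, 0.7, 0.8]
hypothetical_discrimination = [0.25, 0.30, 0.35, 0.40]

id_slope_mean = mean_ratio(hypothetical_detection, hypothetical_discrimination)
id_slope_fit = slope_through_origin(hypothetical_detection, hypothetical_discrimination)
print(id_slope_mean, id_slope_fit)
```

With either estimator, a slope near unity would mean discrimination is about as good as detection; the invented values here give a slope of about one half, which would indicate targets less than one bandwidth apart.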

Table  1  shows  discrimination  versus  detection  of  targets  presented  in 
figure  and  ground  expressed  as  identification/detection  (I/D)  slopes.  Slopes 
which  approach  unity  indicate  that  the  pair  of  stimuli  are  at  least  one 
bandwidth  apart  (see  below).  Here  again,  our  results  confirm  our  hypothesis. 


[Table 1 : body not legible in the source. Surviving fragments: the column
headings "Orientation differences" and "Ground", and two entries reading "94".]

Note: The slopes are calculated from the discrimination/detection ratios.


Table 1 : Identification/detection (I/D) slopes of targets presented alone
and in figure and ground contexts for seven orientation differences for two
observers.

The  table  shows  that  I/D  slopes  for  targets  in  figure  regions 
asymptote  towards  unity  at  9.8  degrees;  those  in  ground  regions  approach 
unity  at  31.8  degrees;  and  those  in  contextless  fields  approach  unity  at  25.9 
degrees.  The  orientation  difference  at  which  I/D  slopes  approach  unity 
reflects  the  orientation  bandwidth  of  the  targets  in  our  stimulus  conditions. 
According  to  Thomas  and  Gille  (1979)  and  Thomas,  Gille,  and  Barker  (1979),  to 
the  extent  that  one  or  both  of  a  pair  of  stimuli  affect  one  or  more 
detectors,  the  stimuli  will  be  confused,  thereby  lowering  the  I/D  slope.  When 
discrimination  is  as  good  as  detection,  the  I/D  slope  will  approach  unity  as 
each  target  affects  separate  detectors.  Thus  our  findings  indicate  that 
orientation bandwidths narrow for targets perceived in figure regions.

3. The time-course of figure versus ground processing: metacontrast masking
studies (Weisstein and Wong, 1986, 1987)

Having  made  headway  into  the  spatial  and  orientation  response  in 
figure  and  ground,  we  turned  our  attention  to  the  time  course  of  responses  in 
figure  and  ground  regions.  Our  prediction  was  that  the  visual  response  in 
figure  would  be  slower  than  that  in  ground.  We  investigated  this  hypothesis 
by looking at the time-course of figure-ground effects on tilt
discrimination.  In  this  experiment  the  observer  discriminated  the  tilt  of  a 
line  segment  (45  degrees  left  or  right  of  vertical)  which  was  flashed  for  50 
msec.  At  various  times  before,  after,  or  simultaneously  with  the  target  the 
Rubin  picture  was  flashed  for  50  msec.  The  target  was  always  presented  in  the 
middle  of  the  picture.  There  were  seven  stimulus  onset  asynchronies  (SOAs)  in 
which  the  Rubin  picture  appeared  before  the  target,  seven  SOAs  in  which  the 
Rubin  picture  appeared  after  the  target,  and  one  condition  in  which  target 
and  Rubin  picture  context  were  presented  together.  The  observer's  task  was  to 
make  a  forced  choice  response  as  to  whether  the  target  line  was  tilted  to  the 
left  or  right,  followed  by  indicating  whether,  during  the  flashed 
presentation  of  the  Rubin  picture,  a  vase  or  two  faces  was  seen.  If  a  vase 
was  perceived,  this  constituted  a  trial  in  which  the  target  was  imaged  in  a 
figure  region.  When  faces  were  perceived,  this  constituted  a  trial  in  which 
the target was imaged in a ground region. The results are shown in Figure 3.

Figure 3 : The time-course of visual response to targets presented in figure
versus ground regions revealed by masking. Data are from two observers.
Results from one other observer show similar trends.
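
The way trials were sorted in this experiment (by SOA and by the percept the observer reported) can be sketched as follows. This is a hypothetical tabulation, not the analysis code used in the study. It assumes each trial is recorded as an SOA in msec, the reported percept ("vase" or "faces"), and whether the tilt response was correct, with negative SOAs meaning the Rubin picture preceded the target. The example trials are invented.

```python
from collections import defaultdict


def proportion_correct_by_condition(trials):
    """Group trials by (region, SOA) and return the proportion correct in each.

    A "vase" report means the target lay in a figure region; a "faces" report
    means it lay in a ground region, as described in the text.
    """
    counts = defaultdict(lambda: [0, 0])  # (region, soa) -> [n_correct, n_total]
    for soa_ms, percept, correct in trials:
        region = "figure" if percept == "vase" else "ground"
        tally = counts[(region, soa_ms)]
        tally[0] += int(correct)
        tally[1] += 1
    return {key: n_correct / n_total for key, (n_correct, n_total) in counts.items()}


# Invented example trials: (SOA in msec, reported percept, tilt response correct?)
example_trials = [
    (-300, "vase", True),
    (-300, "vase", False),
    (-300, "faces", True),
    (200, "vase", False),
    (200, "faces", True),
]
summary = proportion_correct_by_condition(example_trials)
for (region, soa_ms), p in sorted(summary.items()):
    print(region, soa_ms, p)
```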


Under these conditions, we expected to plot the time-course of figure
facilitation and ground interference. Wong and Weisstein (1982) showed that
figure facilitates target discrimination relative to a target in a
contextless field while ground interferes with discrimination. We expected
that  the  figure  facilitation  would  precede  the  ground  interference  in  SOA, 
since  relative  to  a  constant  target  response,  a  faster  contextual  influence 
could  be  presented  later  in  time  for  maximum  effect.  For  SOAs  of  -300  msec 
through  0  msec,  the  tilt  discrimination  of  targets  was  facilitated  by  figure, 
confirming  the  earlier  findings  of  Wong  and  Weisstein  (1982)  and  confirming 
our  prediction  that  figure  facilitation  would  occur  at  negative  SOAs, 
indicating  a  relatively  slow  process.  But  no  corresponding  interfering 
effects of ground were observed. Rather, to our great surprise, at positive
SOAs, we found masking by both figure and ground! This masking was backward
masking (metacontrast): the target appeared before the masking context. No
metacontrast masking has previously been reported for the conditions we used
(30 min separation between target and masking contours, SOAs beyond 200
msec; see Williams and Weisstein, 1981; Breitmeyer, 1984). These metacontrast
conditions,  however,  amply  confirmed  the  hypothesis  that  ground  processing 
was  faster  than  figure  processing.  When  the  masking  contour  was  perceived  as 
the "background boundary" of the region where the target was presented (i.e.
when the target was perceived against ground), the SOA yielding maximum
masking occurred at about 600 msec. On the other hand, when the masking
contour was perceived as the "figure-boundary" of the region where the
target was presented (i.e. when the target was perceived against figure), the
optimal masking SOA was between 200 and 300 msec. According to contemporary
theories of metacontrast masking (see Breitmeyer, 1984; Weisstein, 1972, for
an analysis), longer delays between target and mask indicate faster mask
processing: it is as if the mask must have a very fast latency or neural
conduction time and/or rise-time to "catch up" with the target. As with our
detection, discrimination and orientation results, it appears that we are
looking at a purely perceptual effect. At an SOA of 200 msec, if one sees the
region as figure, masking is obtained; if one sees the region as ground, no
masking  is  obtained.  Supporting  this  is  the  fact  that  no  metacontrast  masking 
has  ever  been  reported  before  for  these  contour-mask  conditions  (Weisstein, 
1972;  Breitmeyer,  1984).  We  shall  further  examine  this  perceptual 
metacontrast  masking  in  several  experiments  proposed  in  our  continuation 
grant  request. 

B. The effects of spatial and temporal frequency responses in the visual
system on figure-ground perception

1. The Spatial Response

1.1 Spatial frequency differences can determine figure-ground organization
(Klymenko and Weisstein, 1986)

So  far  we  had  collected  evidence  that  figure  and  ground  perception 
differentially  influence  the  spatial  and  temporal  response.  Was  the  converse 
true?  Would  the  responses  of  similar  spatial  and  temporal  mechanisms  be 
involved  in  determining  which  area  of  the  visual  field  will  be  perceived  as 
figure  versus  ground? 


We  now  have  overwhelming  evidence  which  supports  the  conjecture  that 
the  spatial  frequency  response  affects  the  perception  of  figure  and  ground. 
Regions  of  ambiguous  pictures  (the  Rubin  picture,  the  Maltese  Cross,  and  a 
bipartite  field;  see  Figure  4)  were  filled  with  sinewave  gratings  of 
different spatial frequencies ranging from 0.5 to 8 cpd at one-octave
intervals.  These  stimulus  combinations  yield  spatial  frequency  differences 
between  regions  of  the  ambiguous  pictures  ranging  from  one  to  four  octaves. 
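
The octave separations mentioned above follow directly from the frequency ratios, since the separation in octaves between two frequencies is the base-2 logarithm of their ratio. The short sketch below (variable and function names are illustrative, not from the study's software) enumerates every pair of the one-octave-spaced frequencies from 0.5 to 8 cpd.

```python
import math
from itertools import combinations

# Frequencies from 0.5 to 8 cpd at one-octave intervals, as stated in the text.
frequencies_cpd = [0.5, 1, 2, 4, 8]


def octave_separation(f_low, f_high):
    """Separation between two spatial frequencies, in octaves."""
    return math.log2(f_high / f_low)


separations = {
    (lo, hi): octave_separation(lo, hi)
    for lo, hi in combinations(frequencies_cpd, 2)
}
for (lo, hi), octaves in separations.items():
    print(f"{lo} vs {hi} cpd: {octaves:.0f} octave(s)")
```

The ten possible pairings give separations of one, two, three, or four octaves, matching the range reported in the text.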


Figure 4 : The Maltese Cross ambiguous picture filled with sinewave gratings.
The higher spatial frequency filled areas are perceived predominantly as
figure and the lower spatial frequency filled areas are perceived
predominantly as ground.

The  combinations  of  spatial  frequencies  in  different  regions  in  the 
picture yielded multi-stable perceptions. Figure-ground stability as a
function  of  spatial  frequency  difference  was  measured  by  the  percentage  of 
time  one  of  the  regions  was  perceived  as  figure.  It  was  found  that  regions 
filled  with  the  higher  spatial  frequency  were  perceived  predominantly  as 
figure  and  regions  filled  with  the  lower  spatial  frequency  were  perceived 
predominantly  as  background.  As  the  octave  separation  between  the  regions 
increased,  the  percentage  of  time  the  higher  spatial  frequency  region  was 
seen  as  figure  increased.  These  findings  reveal  that  relative  rather  than 
absolute  spatial  frequency  values  determine  whether  a  region  will  be 
perceived  as  figure  or  ground.  The  results  are  illustrated  in  Figure  5. 
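
The stability measure described above (the percentage of viewing time a region was perceived as figure) can be illustrated with a small sketch. The report does not say how percept durations were recorded, so the interval log below is an assumption, and its labels and durations are invented.

```python
def percent_time_as_figure(intervals, region):
    """Percentage of total viewing time during which `region` was seen as figure.

    `intervals` is an assumed log of (region_seen_as_figure, duration_in_seconds).
    """
    total = sum(duration for _, duration in intervals)
    as_figure = sum(duration for seen, duration in intervals if seen == region)
    return 100.0 * as_figure / total


# Invented 30-second viewing period with alternating percepts.
hypothetical_log = [
    ("high_sf", 12.0),
    ("low_sf", 4.0),
    ("high_sf", 9.0),
    ("low_sf", 5.0),
]
high_sf_percent = percent_time_as_figure(hypothetical_log, "high_sf")
low_sf_percent = percent_time_as_figure(hypothetical_log, "low_sf")
print(high_sf_percent, low_sf_percent)
```

A value of 50% for both regions would indicate no dominance; values approaching 100% for the higher spatial frequency region correspond to the stable figure percept reported at large octave separations.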


Figure 5 : The percentage of time a region was seen as figure versus ground
as a function of the relative spatial frequency between the regions.

1.2 A spatial frequency effect on perceived depth (Brown and Weisstein,
1987)


In  the  previous  study,  a  pronounced  depth  effect  was  observed  with  the 
perception  of  figure  and  ground.  Figure  was  always  perceptually  localized  in 
front of ground. We next turned our attention to this spatial
frequency-induced depth effect (Brown and Weisstein, 1987). We assessed the
amount of depth induced by spatial frequency differences by cancelling it
with depth induced stereoscopically in the opposite direction. Crossed
disparity was added to one or both regions of a pattern containing sinewave
gratings differing in spatial frequency. The display consisted of rectangular
areas filled with sinewave gratings (see Figure 6a).


Figure 6a and 6b : Display pattern for investigating [...] effects on
perceived depth.

We  found  that  the  regions  with  the  higher  spatial  frequency  were 
perceptually  localized  in  front  of  the  lower  spatial  frequency  regions.  Again 
the  effect  was  dependent  on  the  relative  spatial  frequency  difference  between 
the  regions.  Moreover,  when  spatial  frequency  difference  between  the  regions 
was  greater  than  1.32  octaves,  the  higher  spatial  frequency  region  tended  to 
be  seen  as  foreground  a  greater  percentage  of  the  time  regardless  of  the 
disparity  imposed  on  the  regions,  i.e.  spatial  frequency  difference  dominated 
binocular  disparity  as  a  cue  to  depth.  Using  the  same  configuration,  we  then 
instructed  observers  to  cancel  the  depth  induced  by  spatial  frequency 
differences  between  the  regions  by  adjusting  the  disparity  of  the  image  so 
that  all  regions  within  the  display  lay  on  the  same  depth  plane.  Here  we 
found  that  observers  consistently  placed  the  relatively  lower  spatial 
frequency  filled  regions  closer  in  stereo  depth  than  the  relatively  higher 
spatial  frequency  filled  areas.  Similar  trends  were  observed  when  the 
gratings were placed out-of-phase. The display is shown in Figure 6b. The
findings  are  presented  in  Table  2.  Finally,  the  procedure  was  repeated  using 
square  wave  gratings.  Although  depth  was  occasionally  observed,  neither 
region  was  reliably  perceived  in  front  of  the  other. 


                          Mean Disparity Setting

              Phase Relation         Octave Separation in sf
Experiment    within /'s & //'s      1.32       2.0        3.32

    2         IN                      4.8"      102"       43.2"

    3         IN                     -1.1"      41.3"      52.8"
              OUT                     1.1"      30.3"      45.8"

    4         IN                    -16.2"      -5.6"      -5.6"
              OUT                    -3.9"       2.5"

Table 2: Mean disparity settings when the lower and higher spatial frequency
regions appeared in the same depth plane. Positive settings indicate that the
relatively lower spatial frequency regions were placed closer in stereo depth.


1.3 Spatial frequency, perceived depth, and figure-ground perception (Wong
and Weisstein, 1987b)


Thus,  we  found  that  spatial  frequency  differences  between  the  regions 
of  the  visual  field  can  be  a  powerful  determinant  of  whether  a  region  will  be 
seen as figure or ground, and that accompanying the figure-ground perception
induced by spatial frequency differences is the visual impression that the
segregated areas are localized on different depth planes. We next
investigated the joint roles of spatial frequency and perceived depth in the
perception of figure and ground by stereoscopically cancelling perceived
depth induced by spatial frequency differences. Two displays (the
disk-annulus configuration and the diagonal-triangles configuration; see
Figure 7a and 7b) were used.


Figure 7a and 7b : The disk-annulus and the diagonal-triangles configurations.


As expected, when spatial frequency differences and perceived depth
were both present, the lower spatial frequency region was seen predominantly
as ground and was localized behind the higher spatial frequency region.
However, when no depth was perceived between regions of the ambiguous
pictures (depth induced by spatial frequency difference being cancelled by
stereoscopic depth in the opposite direction), the effect of spatial
frequency differences alone on the dominance of regions as figure and ground
diminished substantially, although a small residual effect remained. When
regions were filled with identical spatial frequencies and perceived depth
was induced stereoscopically between the regions in the ambiguous picture,
the region perceived farther away was predominantly seen as background.

(Note: The finding that the region localized in back was seen as background
is not as obvious as it may first appear. When we fixate on, say, a squirrel
some distance away, both the grass in front of it and the hills in back of it
may be said to be ground.) A summary of these results is presented in Figure
8.


                                       Perceived Depth

                               Present                Absent
                               (at maximum)

Spatial        Present         95%                    66%
Frequency      (at maximum)    GOOD                   POOR
Difference                     Exp. 1 and 2           Exp. 3

               Absent          88%                    55%
                               GOOD                   NONE
                               Exp. 4                 Exp. 4


Figure 8 : Summary of results showing the joint effects of spatial frequency
differences and perceived depth in determining the dominance of figure and
ground.


Discussion  of  spatial  results 


Here  we  encounter  the  first  amplification  of  the  organizing  hypothesis 
concerning  the  association  between  the  spatial  frequency  response  and  figure- 
ground  perception.  What  determined  whether  a  region  would  be  perceived  as 
figure  or  ground  depended  on  the  spatial  frequency  of  a  particular  area 
relative  to  the  other  spatial  frequencies  in  the  image.  These  findings  of 
relative effects of spatial frequency on figure-ground perception are
consistent with the point made by many researchers that the perception of the
visual world is little affected by changes in scaling (see for example
Koenderink and van Doorn, 1982; Burbeck, 1986; Norman and Ehrlich, 1987).


These  findings  of  relative  rather  than  absolute  spatial  frequency 
influences  may  at  first  glance  seem  to  conflict  with  the  findings  presented 
in Part A above. There we found a high spatial frequency response to figure
(sharp  targets,  rightward  shift  in  the  CSF)  and  a  low  spatial  frequency 
response  to  ground  (blurred  targets,  leftward  shift  in  CSF).  However,  models 
of  relative  spatial  frequency  coding  can  be  readily  inferred  from  our 
overlapping  CSF  functions  and  as  we  obtain  better  estimates  of  these 
functions  we  will  be  able  to  construct  some  plausible  theories  which  are 
compatible with both sets of data. In addition, differences in picture
contexts or grating-patch size (see Methods of Procedure in the continuation
request) may produce shifts in the CSFs due to image scaling, and these
shifts, too, would enter into our theoretical picture.

Finally, we have seen that perceived depth appears to mediate the
perception of figure and ground (and, interestingly, that perceived depth
yields a multistable perception in our conditions whether it is generated by
spatial frequency differences or by stereoscopic disparities). That depth and
figure-ground coding are closely associated has often been noted. We will not
propose any specific studies that further explore perceived depth and
figure-ground relationships (for example, conflicting monocular cues to
spatial-frequency-induced depth), but we will keep this issue in mind and
will always track perceived depth mediation of figure and ground as we
investigate further stimulus variables that induce such perceptions.

An  interesting  sideline  to  our  findings  of  relative  spatial  frequency 
effects  in  figure-ground  perception  is  that  the  higher  spatial  frequency 
region  (thinner  bars)  was  perceived  as  figure  lying  in  front  of  the  lower 
spatial  frequency  region  (wider  bars).  This  appears  to  contradict  the  common 
knowledge  of  size-distance  scaling.  Although  no  specific  experiments  dealing 
directly with this issue are included in the continuation proposal, we shall
keep  this  in  mind  when  we  investigate  the  scaling  effects  described  above. 

2. The Temporal Response

While our predictions relating spatial frequency to figure-ground
perception proceeded from our general organizing hypothesis without initial
experimental  exploration,  our  study  of  the  role  of  the  temporal  response  in 
figure-ground  perception  was  based  on  some  observations  regarding  flicker  and 
perceived  depth.  In  a  pre-proposal  report  of  preliminary  findings,  we 
described  a  "flicker-induced  depth"  effect.  We  observed  that  flickering 
regions  of  a  random-dot  field  tended  to  be  perceptually  localized  behind 
nonflickering  regions.  This  phenomenon  was  investigated  more  systematically 
during  the  funding  period.  We  examined  the  relations  among  temporal  frequency 
(as flicker), perceived depth induced by the flicker, and figure-ground
perception.

2.1 Flicker-induced depth and related effects (Wong and Weisstein, 1984,
1985)


These  effects  were  described  in  the  preliminary  report  in  the  proposal 
and  we  shall  only  mention  them  briefly  here.  Flickering  a  region  makes  it 
look  as  if  it  were  behind  a  nonflickering  region.  Observers  often  report  that 
the  flickering  regions  appeared  like  backgrounds  shimmering  behind  the  static 
regions.  This  effect  is  very  robust  and  can  be  obtained  regardless  of  the 
stimulus  pattern  in  the  flickering  and  nonflickering  areas.  In  addition, 
there  was  a  range  of  optimal  flicker  frequencies  (6  to  8  Hz)  at  which  maximum 
depth  was  observed.  The  effect  diminished  as  flicker  frequencies  increased  or 
decreased.  The  effects  were  also  maximal  when  the  flickering  regions  were 
modulated at 100% luminance. However, the flicker-induced depth effect was
not  attributable  to  luminance  or  perceived  brightness  difference  between  the 
flickering  and  nonflickering  regions.  Theories  of  brightness  as  a  cue  to 
depth  predict  that  the  brighter  region  would  be  perceived  nearer  than  the 
dimmer  region  (Ittelson,  1960).  However,  in  our  studies,  regions  which  were 
flickering  but  brighter  than  the  nonflickering  regions  were  perceptually 
localized as farther away.

2.2 The effects of flicker on the perception of figure and ground (Wong and
Weisstein, 1987a)


In  this  study  we  found  that  flickering  regions  of  an  ambiguous  picture 
were  perceived  predominantly  as  backgrounds,  and  adjacent  nonflickering 
regions  were  perceived  predominantly  as  figures.  The  displays  together  with 
the  visual  impression  they  created  are  shown  in  Figure  9a  and  9b.  This  was 
true  whether  or  not  the  regions  were  outlined  by  contours  or  merely  defined 
by temporal changes. This effect of "flicker-induced ground" was optimal when
the flicker frequency was between 6 and 8 Hz. Maximum perceived
depth segregation between the flickering and nonflickering regions also
occurred at these flicker rates. At lower (1.4 Hz) and higher (12.5 Hz)
flicker rates, regions maintained their segregation but the dominance of a region
as figure or ground and the depth segregation between the flickering and
nonflickering regions diminished. This tuning function of flicker-induced
ground  appears  to  be  similar  to  those  of  visual  pathways  most  sensitive  to 
high  temporal  frequency.  This  suggests  a  close  relation  between  the  high 
temporal  frequency  response  and  ground  perception.  Sample  findings  are 
presented  in  Figure  10. 


Figure 9a and 9b: Displays used in examining the effect of flicker on
figure-ground perception and the visual impression they created. Panel
labels: "Flickering Flanking Regions" and "Flickering Central Region".

Figure 10: The percentage of time a region was seen as figure is plotted as
a function of temporal frequency. The figure shows the ambiguous picture used

2.3 Flicker, perceived depth, and figure-ground perception (Wong and
Weisstein, 1987c)

As  with  spatial  frequency,  temporal  frequency  differences  also  induce 
depth.  We  next  examined  the  joint  roles  of  flicker  and  perceived  depth  on  the 
perception  of  figure  and  ground  by  stereoscopically  cancelling  the  depth 
induced by the flickering region. (This experiment was similar to 1.3 above,
using temporal rather than spatial frequency.) The figure-ground context was
the diagonal-triangles picture filled with random dots (see Figure 9b). The
percentage  of  time  a  flickering  region  was  perceived  as  ground  was  measured 
for  four  temporal  frequencies  (1.4,  6.3,  8.3,  and  12.5  Hz).  Binocular 
disparity  was  introduced  into  the  regions  and  the  amount  of  relative 
disparity  required  to  cancel  perceived  depth  induced  by  flicker  at  each  of 
the  flicker  frequencies  was  recorded.  We  also  tested  binocular  disparity 
alone,  without  flicker,  as  a  cue  to  figure-ground  and  perceived  depth.  The 
percentage  of  time  the  flickering  regions  were  perceived  as  background  was 
obtained  when  the  flickering  and  nonflickering  regions  were  perceived  as 
coplanar.  When  flicker  was  absent  we  introduced  depth  stereoscopically  and 
again  measured  the  percentage  of  time  the  region  localized  farther  away  was 
seen  as  background.  Results  indicate  that  when  perceived  depth  between 
regions  was  absent,  the  effect  of  flicker  on  the  perception  of  a  region  as 
ground  diminished  substantially.  When  perceived  depth  (produced 
stereoscopically)  without  flicker  was  present,  the  region  localized  in  front 
was  predominantly  seen  as  figure  and  the  region  localized  farther  away  was 
predominantly  seen  as  ground.  Note  that  in  this  condition  without  flicker  but 
with stereoscopic depth, a multi-stable perception was also produced. These
findings are shown in Figures 11 and 12, respectively.

Figure 11: The percentage of time the flickering regions were perceived as
background when perceived depth was present versus when perceived depth was
absent




Figure 12: The percentage of time regions localized farther away were seen as
ground as a function of relative disparity

Discussion  of  temporal  results 

In  general,  our  hypothesis  relating  high  temporal  frequency  to  ground 
perception  was  supported.  However,  contrary  to  our  results  with  spatial 
frequency and figure-ground, flicker-induced ground appeared to have a tuning
function, with the peak centered around 6 to 8 Hz. Given the findings
reported below, though, it is premature to conclude that the
temporal response in figure-ground perception is absolute rather than
relative  in  nature.  In  our  experiments  the  flickering  regions  were  always 
seen  in  relation  to  static  regions.  When  regions  are  flickered  at  different 
rates,  however,  temporal  tuning  appears  to  diminish  (see  below). 

Furthermore,  random-dot  displays  contain  spatial  frequencies  across  a  wide 
range  of  the  frequency  spectrum.  Thus,  in  random-dot  displays  we  might  be 
observing the temporal response of a number of spatial frequency-tuned
mechanisms together. Finally, we used on-off flicker. Sinusoidal flicker of a
single sinewave may yield still different results. In the next section we
report effects of flicker on figure and ground in conditions in which the
display contained only one spatial frequency at a time. However, square-wave
flicker was still used.


Again,  we  would  like  to  comment  on  the  fact  that  these  temporal 
results,  like  our  spatial  frequency  results,  contradict  the  notion  of 
temporal  changes  as  cues  to  distance.  If  flow  fields  (as  in  motion  parallax) 
are  regarded  as  cues  to  distance,  viz.  that  faster  moving  fields  are 
localized  nearer  the  observer  than  slower  moving  fields  (Farber  and  McConkie, 
1979;  McConkie  and  Farber,  1979;  Rogers  and  Graham  1982),  one  would  expect 
that  perceived  distance  should  conform  to  some  monotonic  relation  to  the  rate 
of  temporal  change.  However,  our  findings  on  the  relation  between  temporal 
frequency  and  perceived  depth  were  that  perceived  depth  first  increased  with 
flicker rate until 8 Hz and then declined, showing an inverted U-shaped
function of flicker rate. These findings suggest that the role of temporal
change in perceived depth as well as in figure-ground perception is more
complex  than  previously  thought. 


3. The Spatio-temporal Response

So  far  we  have  been  investigating  the  figure-ground  response  to 
spatial  and  temporal  frequency  by  manipulating  these  variables  separately  and 
observing  their  effects.  We  have  found  that  relative  spatial  frequency 
differences  affect  which  region  of  the  visual  field  will  be  seen  as  figure 
and  which  will  be  seen  as  ground.  We  have  also  found  that  a  temporal  tuning 
frequency  of  6  to  8  Hz  causes  a  flickering  region  to  be  optimally  seen  as 
ground.  What  about  the  combined  effects  of  both  spatial  and  temporal 
variables? 


3.1 Spatial and temporal frequency and figure-ground organization (Klymenko,
Weisstein, and Topolski, 1987)


First, we asked whether flicker-induced ground perception
changes when the uniform spatial frequency of an image increases. For each
of  four  spatial  frequencies,  we  investigated  the  effects  of  relative  flicker 
on  the  perception  of  figure  and  ground.  We  used  the  Maltese  Cross  ambiguous 
picture  (see  Figure  13).  The  diameter  of  the  circular  area  enclosing  the 
Maltese  Cross  was  5.66  degrees.  The  mean  luminance  of  the  area  occupied  by 
the Maltese Cross pattern was 39.0 cd/m2. The surrounding background was dark
(9 cd/m2).


Figure 13: The Maltese Cross filled with sinewave gratings.

There were four flicker rates (0, 3.75, 7.5, 15 Hz) and four spatial
frequencies (0.5, 1, 4, 8 cpd). We used square-wave on-off flicker, in which
the grating pattern alternated with a blank field whose luminance equaled the
mean luminance of the grating. The observers first matched the apparent
contrast of the flickering gratings (3.75, 7.5, 15 Hz) to that of a
stationary grating at each of the spatial frequencies (0.5, 1, 4, 8 cpd).
These were the contrast settings for the main
experiment  in  which  observers  monitored  the  time  in  which  a  particular  region 
of  the  display  was  perceived  as  figure  versus  background  for  30  seconds.  The 
observer  moved  a  joystick  right  or  left  to  indicate  whether  the  cross 
oriented  left  or  right  of  vertical  was  seen  as  background.  There  was  also  a 
"null"  response,  indicated  by  the  "middle"  position  of  the  joystick,  in  which 
neither  cross  was  perceived  as  figure.  We  recorded  the  mean  percent  response 
time observers saw the rightward-tilted cross as the background. The data are
shown  in  Figure  14. 
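
The percent-response-time measure described above can be illustrated with a short sketch. It assumes that the joystick was sampled at regular intervals and that each sample carries one of three hypothetical labels ("left", "right", "middle"). The sampling scheme, the label names, the mapping of "right" to "rightward-tilting cross seen as background", and the choice of denominator are all assumptions made for illustration; the report does not specify them.

```python
from collections import Counter

def percent_background_time(samples, target="right", include_null=True):
    """Percentage of samples in which `target` was reported as background.

    samples: sequence of "left", "right", or "middle" (null response),
             assumed to be taken at equal time intervals.
    include_null: if True, null samples count toward the total viewing time;
                  if False, only left/right samples form the denominator.
    """
    counts = Counter(samples)
    total = len(samples) if include_null else counts["left"] + counts["right"]
    if total == 0:
        return 0.0
    return 100.0 * counts[target] / total

# Hypothetical 30-second trial sampled once per second.
trial = ["right"] * 18 + ["left"] * 9 + ["middle"] * 3
print(percent_background_time(trial))                      # share of total viewing time
print(percent_background_time(trial, include_null=False))  # share of non-null response time
```

Whether null responses belong in the denominator changes the value, so the convention should be fixed before comparing conditions.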

Figure 14: The effects of relative temporal frequency on the perception of
figure and ground for four spatial frequencies (0.5, 1, 4, 8 cpd). Figure
legends are contained within the top-left to bottom-right diagonal of the
matrix. Each cell contains data from the four spatial frequencies used
(beginning from the top line: 8, 4, 1, 0.5 cpd). Above the diagonal are the
stimuli in which the rightward-tilting cross contained the higher temporal
frequency. Below the diagonal are the stimuli in which the rightward-tilting
cross contained the lower temporal frequency. The mean percent
response time that the rightward cross was seen as background is in the right
column of each cell. (We also recorded absolute response time, shown in the
left column of the cell, but space does not allow discussion of this here.)

The results show that in general, the cross with the high temporal
frequency,  regardless  of  the  spatial  frequency  in  the  regions,  was  perceived 
primarily  as  background.  As  the  temporal  frequency  difference  increased,  the 
effect  of  the  high  flicker  rate  on  determining  a  region  as  background  also 
tended  to  increase  for  displays  with  spatial  frequencies  at  1  and  8  cpd.  The 
results  for  the  0.5  and  4  cpd  conditions  were  unclear.  It  appeared  that  the 
flicker  "tuning  function"  for  figure-ground  perception  is  different  at 
different  spatial  frequencies.  This  is  revealed  by  comparing  the  percentage 
of  time  a  flickering  region  was  seen  as  background  for  3.75,  7.5,  and  15  Hz 
as  a  function  of  spatial  frequency  of  the  display  in  a  condition  in  which  the 
other  cross  is  static. 


We  next  held  temporal  frequency  constant  within  conditions  while 
varying  spatial  frequency.  In  one  condition,  the  two  crosses  were  filled  with 
sinewave  gratings  of  1  and  4  cpd  respectively.  In  the  second  condition,  they 
were filled with gratings of 1 and 8 cpd respectively. For each of these two
spatial frequency combinations, there were four temporal frequencies at which
the entire pattern flickered: 0, 3.75, 7.5, and 15 Hz. When patterns
flickered, the "on" half cycle of one cross coincided with the "off" half
cycle of the other cross. The percent response time that the rightward-tilting
crosses were seen as background was recorded. The results are shown
in  Figure  15. 

Figure 15: The effect of temporal frequency on figure-ground perception when
regions differ in spatial frequency. In the top half of each cell are the
stimuli in which the rightward-tilting cross contained the relatively lower
spatial frequency sinewave grating, and in the bottom half are the stimuli in
which the rightward-tilting cross contained the higher spatial frequency
grating. The mean percent response time is shown for each temporal frequency
in which the rightward-tilting cross was perceived as background.


The  results  indicate  that  when  flicker  was  absent,  the  low  spatial 
frequency region was seen predominantly as background. This is consistent
with  the  previous  findings  of  Klymenko  and  Weisstein  (1986)  and  Wong  and 
Weisstein  (1987b).  However,  when  the  patterns  flickered,  the  dominance  of  the 
low  spatial  frequency  region  as  background  decreased.  This  was  especially 
noticeable  when  the  regions  differed  by  two  octaves  of  spatial  frequency. 
These findings suggest that the temporal response overrides the spatial
frequency response in figure-ground perception.

3.2 Contrast reversal flicker, spatial frequency, and figure-ground
perception (Klymenko and Weisstein, 1987b)

We  next  ran  the  same  temporal  and  spatial  frequency  conditions  using  a 
circular  field  divided  evenly  into  two  regions.  The  entire  circle  had  a 
diameter  of  5.66  degrees.  The  mean  luminance  of  the  circular  region  was  39.0 
cd/m2. The surrounding background was 9.0 cd/m2. The experimental design was
identical  to  the  previously-described  study  except  that  instead  of  on-off 
flicker,  contrast  reversal  flicker  was  used.  The  flickering  gratings 
underwent  contrast  reversal  with  a  square  temporal  waveform.  For  the  patterns 
in  which  both  regions  flickered,  the  temporal  onsets  of  the  two  gratings 
coincided. The results are shown in Figure 16. Four matrices of data show the
effect of contrast reversal flicker on figure-ground perception when the
spatial frequencies of the display were 0.5, 1, 4, and 8 cpd.
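
The two flicker types used in these studies differ only in what is shown during the second half of each flicker cycle. The sketch below is a minimal illustration and not the original display software. The frame rate, the grating contrast, the row width, and the one-dimensional luminance profile are assumed values, and the function names are hypothetical.

```python
import math

def grating_row(sf_cpd, width_deg, n_pixels, mean_lum, contrast):
    """Luminance profile (cd/m2) of a sinewave grating along one row."""
    return [
        mean_lum * (1.0 + contrast * math.sin(
            2.0 * math.pi * sf_cpd * k * width_deg / n_pixels))
        for k in range(n_pixels)
    ]

def flicker_frames(row, mean_lum, flicker_hz, frame_rate, n_frames, mode):
    """Square-wave flicker: return one luminance row per video frame.

    mode "on_off":   second half-cycle shows a blank field at mean luminance.
    mode "reversal": second half-cycle shows the contrast-reversed grating.
    A flicker rate of 0 Hz yields the static grating on every frame.
    """
    blank = [mean_lum] * len(row)
    reversed_row = [2.0 * mean_lum - lum for lum in row]
    second_half = blank if mode == "on_off" else reversed_row
    frames = []
    for i in range(n_frames):
        cycle_position = (i * flicker_hz / frame_rate) % 1.0
        frames.append(row if cycle_position < 0.5 else second_half)
    return frames

# Assumed values: 60 frames/s display, 50% contrast, 4-degree-wide row.
row = grating_row(sf_cpd=1.0, width_deg=4.0, n_pixels=64, mean_lum=39.0, contrast=0.5)
on_off = flicker_frames(row, 39.0, flicker_hz=7.5, frame_rate=60, n_frames=8, mode="on_off")
reversal = flicker_frames(row, 39.0, flicker_hz=7.5, frame_rate=60, n_frames=8, mode="reversal")
```

Under these assumptions, on-off flicker leaves half of the grating modulation in the time-averaged image, whereas contrast reversal averages to the mean luminance at every pixel.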

Figure 16 (Experiment 2; axis label: temporal frequency (Hz) of right
semicircle): The cells show mean percent response time. Above the top-left to
bottom-right diagonal are the responses for the right semicircle when it is
undergoing the higher rate of contrast reversal flicker. Below the diagonal
are the responses for the right semicircle when it is undergoing the lower
rate of contrast reversal flicker. The lowest cell of the diagonal indicates
the spatial frequency of the sinewave gratings tested in the condition.

In general, the semicircle undergoing the higher rate of contrast
reversal flicker was perceived as background more often than the region with
the lower rate. The percentage of time the region with the higher flicker
rate was seen as background increased as the difference between the flicker
rates increased. This trend occurred for all the spatial frequency
conditions. Also, as in the previous experiment, temporal frequency overrode
spatial frequency when the spatial frequencies varied among regions and
the entire display was flickered. Overall, the results were similar to those
of the studies using on-off flicker.

3.3 Stereopsis and flicker-induced depth (Klymenko and Weisstein, 1987c)

We  next  investigated  the  perceived  depth  segregation  induced  by 
flicker  using  a  stereoscopic  cancellation  technique  similar  to  the  ones  used 
by  Brown  and  Weisstein  (1987)  and  Wong  and  Weisstein  (1987b, c).  The  display 
was  the  ambiguous  picture  consisting  of  two  Maltese  Crosses  (see  Figure  13, 
p.  17)  covering  a  circular  area  with  a  diameter  of  5.66  degrees.  The  two 
crosses were filled with sinewave gratings. The leftward-tilting cross was
static and it was filled with a sinewave grating of 1.67 cpd. This was the
region with adjustable disparity. The other cross was the "test" pattern. In
one condition it was filled with a 1 cpd grating; in the second condition it
was filled with a 4 cpd grating. Two types of square-wave flicker were used:
on-off flicker and contrast reversal flicker. There were three flicker rates
for each type of flicker: 3.75, 7.5, and 15 Hz. During the trial, when the test
pattern  flickered,  the  observers  adjusted  the  disparity  of  the  static  region 
such  that  the  flickering  and  nonflickering  crosses  appeared  coplanar. 
Stereopsis  was  created  using  typical  anaglyphic  methods  in  which  one  eye 
viewed  the  display  through  a  red  filter  and  the  other  eye  viewed  the  display 
through  a  green  filter.  The  results  are  shown  in  Table  3.  Observers  had  to 
set  regions  stereoscopically  in  a  depth  direction  opposite  to  that  induced  by 
flicker.  The  region  with  the  higher  relative  flicker  was  always  set 
stereoscopically  behind  the  other  region  before  both  regions  were  perceived 
as  coplanar. 
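
The anaglyphic method mentioned above can be sketched for a single row of pixels. This is only an illustration under stated assumptions: disparity is expressed as a whole number of pixels, the two eyes' views are placed in the red and green channels by shifting the same row in opposite directions, and the helper names are hypothetical. The report does not say which eye wore which filter or how the Grinnell system implemented the shift.

```python
def shift_row(row, shift, fill=0):
    """Shift a row of pixel values horizontally by `shift` pixels (positive = right)."""
    n = len(row)
    out = [fill] * n
    for i, value in enumerate(row):
        j = i + shift
        if 0 <= j < n:
            out[j] = value
    return out

def anaglyph_row(row, disparity_px):
    """Return (red, green, blue) pixels with the two views offset by `disparity_px`.

    The red channel holds one eye's view and the green channel the other's;
    the green image is displaced by `disparity_px` pixels relative to the red one.
    """
    half = disparity_px // 2
    red = shift_row(row, -half)
    green = shift_row(row, disparity_px - half)
    return [(r, g, 0) for r, g in zip(red, green)]

# Hypothetical 5-pixel row with a single bright pixel, 2 pixels of disparity.
pixels = anaglyph_row([0, 0, 10, 0, 0], disparity_px=2)
print(pixels)
```

Reversing the sign of `disparity_px` swaps the direction of the offset, which corresponds to switching between crossed and uncrossed disparity once the filter-to-eye assignment is fixed.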


Experiment 1

                       Temporal         Individual Disparity Setting
Condition              Frequency (Hz)   Mean        SD         Comparison Test

Stationary (control)    0               +.0150      .0518
Contrast reversal       3.75            +.0104      .0493      0.26
                        7.5             -.0208      .0581      2.06
                       15.0             -.0346      .0619      2.85*
On-off flicker          3.75            -.0144      .0738      1.69
                        7.5             -.0599      .0661      4.30**
                       15.0             -.0300      .0528      2.58*

Experiment 2

                       Temporal         Individual Disparity Setting
Condition              Frequency (Hz)   Mean        SD         Comparison Test

Stationary (control)    0               +.0082      .0616
Contrast reversal       3.75            +.0054      .0351      0.19
                        7.5             -.0233      .0571      2.14
                       15.0             -.0409      .0705      3.34**
On-off flicker          3.75            -.0074      .0636      1.06
                        7.5             -.0303      .0589      2.62*
                       15.0             -.0186      .0623      1.82

Table 3: The disparity settings required to null flicker-induced depth in
each flicker condition for a 1 cpd and a 4 cpd stimulus. Disparity settings
are given in degrees of visual angle. Positive numbers represent crossed
disparity and negative numbers represent uncrossed disparity. Disparity
settings less than -.0052 indicate that the adjustable region was set
stereoscopically behind the "test" pattern.
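
Because the settings in Table 3 are small fractions of a degree, it can help to read them in minutes of arc and to apply the caption's criterion mechanically. The sketch below copies the mean settings from the table. The dictionary layout, the "upper"/"lower" labels for the two blocks of the table, and the function names are illustrative choices, not part of the original analysis.

```python
# Mean disparity settings (degrees of visual angle) copied from Table 3.
# "upper" and "lower" refer to the two blocks of the table.
TABLE_3_MEANS = {
    ("upper", "contrast reversal"): {3.75: +0.0104, 7.5: -0.0208, 15.0: -0.0346},
    ("upper", "on-off"): {3.75: -0.0144, 7.5: -0.0599, 15.0: -0.0300},
    ("lower", "contrast reversal"): {3.75: +0.0054, 7.5: -0.0233, 15.0: -0.0409},
    ("lower", "on-off"): {3.75: -0.0074, 7.5: -0.0303, 15.0: -0.0186},
}
BEHIND_CRITERION_DEG = -0.0052  # criterion stated in the table caption

def to_arcmin(degrees):
    """Convert degrees of visual angle to minutes of arc."""
    return degrees * 60.0

def summarize(means):
    """List (block, flicker type, rate in Hz, mean in arcmin, below criterion?)."""
    rows = []
    for (block, flicker), by_rate in means.items():
        for rate_hz, mean_deg in by_rate.items():
            rows.append((block, flicker, rate_hz,
                         round(to_arcmin(mean_deg), 2),
                         mean_deg < BEHIND_CRITERION_DEG))
    return rows

for row in summarize(TABLE_3_MEANS):
    print(row)
```

This only restates the tabulated means; it says nothing about the statistical comparisons, whose test is not identified in the table.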

Discussion of Spatio-temporal results


The results of the spatio-temporal studies lead to two amplifications
of our organizing hypothesis. First, it appears that the effect of relative
flicker on the perception of figure and ground depends on the spatial
frequency content of the region. Second, it appears that the effects of
spatial and temporal frequency on figure-ground and depth perception are not
equally balanced. Rather, when flicker is introduced into a pattern whose
regions vary in spatial frequency, the perception of depth and figure versus
ground diminishes or disappears. The dominance of flicker over spatial
frequency in inducing figure-ground perception suggests that temporal changes
might emerge as a more important determinant of figure-ground perception than
spatial  changes.  This  leads  to  further  questions  not  only  about  the  role  of 
flicker  (as  temporal  frequency)  in  figure-ground  perception  but  that  of 
motion  (in  angular  velocity)  as  well.  Finally,  because  we  are  dealing  with  a 
perceptual  effect,  we  need  to  examine  simultaneous  changes  in  flicker  rate  or 
angular  velocity  and  spatial  frequency.  The  perceptions  arising  from  such 
changes  cannot  be  predicted  by  simply  holding  one  of  the  variables  constant 
and  changing  the  other  because  we  cannot  assume  linear  effects.  Varying  both 
temporal  and  spatial  frequencies  simultaneously  will  allow  us  to  chart  a 
spatio-temporal  tuning  response  surface  of  figure-ground  perception. 

4. The Motion Response

We have just begun to explore the role of the angular velocity of motion
in the perception of figure and ground. Below we report the findings of a
preliminary  study. 

4.1  The  role  of  velocity  of  moving  fields  in  the  perception  of  figure  and 
ground  (Wong  and  Weisstein,  1987d) 

In  this  study  we  investigated  how  the  velocity  of  moving  fields 
affected  the  perception  of  a  region  as  figure  versus  ground.  We  used  a 
display  consisting  of  a  center  and  surround  region  filled  with  sinewave 
gratings. The display is shown in Figure 7a (p. 11). The gratings were set to
a spatial frequency of 1 cpd. One region of the display was always stationary
while the other region moved at 0.5, 1, 2, 4, 8, or 16 degrees per second.
The observer was instructed to monitor figure-ground perception in the same
way  described  in  the  other  experiments.  We  found  that  as  the  velocity 
increased,  the  moving  grating  was  seen  as  ground  more  often  than  the 
stationary  one.  The  effect  reached  a  plateau  at  8  deg/sec  and  no  increase  was 
observed  at  16  deg/sec.  The  data  suggest  that  the  motion  response  in  the 
figure-ground  system  might  have  an  absolute  upper  velocity  limit.  This 
conjecture  is  reasonable  given  that  images  moving  at  high  velocities  appear 
blurred  and  fused  (Graham,  1966;  Smith  1987).  At  velocities  between  15  and  30 
deg/sec,  directionality  of  motion  can  still  be  discriminated  for  a  1  cpd 
grating,  but,  perceptually,  velocity  does  not  seem  to  be  specifiable  (Smith 
1987).  Thus,  in  our  displays,  the  fast-moving  images  might  be  "treated" 
equally  by  the  motion  response  in  the  figure-ground  system  as  blurred  images. 
Indeed,  when  we  moved  the  gratings  at  velocities  of  24  and  32  deg/sec 
observers reported blurring or streaking in the images. This suggests that the
motion response in figure-ground perception has a high velocity limit. Figure
17 summarizes the results.

Figure 17: The percentage of time a moving region is perceived as background
as a function of angular velocity.

While we cannot make any conclusive statements about the relation
between the roles of motion and flicker in figure-ground perception given
these exploratory findings, we can make some comparisons between
these data and our earlier findings. In particular, we can compare these
findings to those of Klymenko, Weisstein, and Topolski (1987) in the
condition in which a region filled with a 1 cpd grating was flickered at
3.75, 7.5, and 15 Hz. The percentage of time the flickering region was seen
as background showed an inverted U-shaped function with a prominent peak at
7.5 Hz. This resembles the temporal tuning function obtained by Wong and
Weisstein (1987a). However, the "velocity" function of "motion-induced
ground" obtained in our preliminary study, while showing the same monotonic
increase in the percentage of time the moving region was seen as ground as a
function of velocity up to 8 deg/sec, continues to stay high as angular
velocity increases. Since the spatial frequency of the display used in this
experiment is 1 cpd, we can convert the angular velocities into temporal
frequencies and directly compare them with the Klymenko, Weisstein, and
Topolski (1987) flicker data. (The comparison is not perfect since Klymenko
et al. used square-wave flicker, while in this study the moving sinewave
gratings produced a sinusoidal temporal change on the retina.) Figure 18
compares the results of the three studies. These findings reveal complex
relations between flicker, motion, and spatial frequency in the perception
of figure and ground which we propose to study further in our continuation
proposal.

Figure 18: The tuning functions of "flicker-induced ground" compared with
that of "motion-induced ground"
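
The conversion from angular velocity to temporal frequency used for this comparison follows from the standard relation for a drifting grating: temporal frequency (Hz) equals velocity (deg/sec) multiplied by spatial frequency (cpd). A minimal sketch with the velocities of the motion study is given below; the function name is an illustrative choice.

```python
def temporal_frequency_hz(velocity_deg_per_sec, spatial_frequency_cpd):
    """Temporal frequency at a fixed retinal point for a drifting grating."""
    return velocity_deg_per_sec * spatial_frequency_cpd

VELOCITIES = [0.5, 1, 2, 4, 8, 16]  # deg/sec, from the motion study
SPATIAL_FREQUENCY = 1.0             # cpd, as used in the motion study

for v in VELOCITIES:
    print(f"{v:>4} deg/sec -> {temporal_frequency_hz(v, SPATIAL_FREQUENCY):>4} Hz")
```

At 1 cpd the two scales coincide numerically, so the 8 deg/sec plateau corresponds to 8 Hz, close to the 7.5 Hz flicker rate at which the flicker function peaked.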


C. Related Studies

In  addition  to  figure-ground  investigations  described  above,  we  have 
also  conducted  a  number  of  related  studies  which  are  described  briefly  below. 

1.1 Experiments on Visual Phantoms

Brown  and  Weisstein  (1987a)  conducted  a  number  of  experiments 
investigating  a  phenomenon  of  moving  visual  phantoms  first  discovered  by 
Tynan and Sekuler (1975). Two experiments examined the influence of two
sensory factors on phantom visibility and perceived figure-ground segregation
(Brown  and  Weisstein,  1985,  1988).  Phantom  visibility  was  systematically 
affected  by  manipulating  the  lightness  of  the  occluder  (the  region  where  the 
illusion  appears)  and  the  way  an  inducing  grating  was  flickered.  Peak 
flickering  phantom  visibility  occurred  when  occluder  lightness  was  the  same 
as  the  nonflickering  stripes  of  a  black  and  white  square-wave  inducing 
grating.  Phantom  visibility  was  optimal  for  a  dark  occluder  when  a  standard 
on-off  flicker  (Type  I  flicker)  was  used,  that  is,  when  the  off  portion  of 
the  flicker  cycle  was  a  blank  black  field.  When  a  reverse  on-off  flicker 
(Type  II  flicker)  was  used,  (the  "off"  portion  of  the  flicker  cycle  being  a 
blank  white  field)  phantom  visibility  was  optimal  for  a  light  occluder. 
Phantom  visibility  was  consistently  absent  when  a  gray  occluder  was  used. 

With a gray occluder, the presence of clear edges between the inducing
grating and occluder unambiguously defined an occluding surface, blocking the
assimilation of brightness across the occluder (Grossberg and Mingolla,
1985). The early, low-level brightness information could not be overcome by
higher-order interpretational or representational processes. When occluder
lightness was similar to either the light or dark grating stripes the
low-level brightness information was ambiguous, allowing for the possibility
of phantoms. When occluder lightness was ambiguous, an influence of
flickering/nonflickering relations of the grating stripes became apparent. In
both experiments phantom visibility was optimal when the stripes completing
as phantoms were not flickering. This perception of nonflickering regions as
a figure in front of the flickering regions is consistent with our results
regarding flicker, figure-ground identification, and the perception of depth
(see section above on figure-ground perception and the temporal response)
even though portions of the perceived figure were illusory.

Using interposition as a cue to create different figure-ground and
depth relations within various phantom inducing patterns, phantom visibility
was measured while the local inducing environment of the patterns was kept
the same (Brown, 1986; Brown and Weisstein, 1986, 1987a). Two measures of
phantom visibility were recorded: phantom strength (percentage of viewing
time phantoms were visible) and incubation time (time elapsed on each trial
before phantoms were first reported). When the inducing pattern specified the
phantom inducing regions as ground instead of figure, phantom strength was
reduced by one half and incubation time was nearly doubled. These results
indicate a link between phantom induction and the representation of figure
and ground. When conflicting figure-ground and depth information is present
the likelihood that the visual system will complete the representation
declines.


A  series  of  experiments  showing  that  the  perception  of  phantoms  can 
affect  the  visibility  of  physically  present  targets  further  supports  the 
conjecture  that  phantoms  are  perceived  and  represented  as  figural  regions 
(Brown  and  Weisstein,  1988).  Tilt  discrimination  of  a  line  segment  was  found 
to  be  superior  in  phantom  versus  non-phantom  regions,  even  though  both 
regions  were  physically  the  same.  There  was  no  difference  in  performance 
between  these  regions  when  phantoms  were  not  visible.  These  results 
paralleled  those  found  by  Wong  and  Weisstein  (1982)  on  the  discrimination  of 
tilted  line  segments  presented  in  figure  and  ground  regions  of  an  ambiguous 
picture.

1.2 Motion-induced contours

1.21  Type  of  motion 

Klymenko  and  Weisstein  (1984)  reported  that  moving  illusory  contours 
may  be  functionally  defined  in  terms  of  the  type  of  motion  which  enhances 
them. They found that the motion-induced contour, an illusory dihedral edge,
is enhanced by three-dimensional motion, but not by two-dimensional motion.
Rotation-in-depth induced superior contour perceptibility compared to a
stationary image, but rigid image motion did not. On the other hand, moving
monohedral edges (e.g., see Tynan and Sekuler, 1975) may be induced or
enhanced by two-dimensional motion, but three-dimensional motion does not
enhance contour perceptibility significantly more. Klymenko and Weisstein
(1984) also found that contour perceptibility of the motion-induced contour
is inversely correlated with dihedral angle size, from 180 degrees to about
45 degrees. The contour is more perceptible for smaller dihedral angles.

1.22  Further  three-dimensional  motion 

Since  it  has  been  well  established  that  the  motion- induced  contour  is 
due  to  three-dimensional  motion,  specifically  rotation- in-depth ,  Klymenko, 
Weisstein,  and  Ralston  (1987)  tested  additional  three-dimensional  image 
transformations.  They  found  that  looming  ( translation- in-depth  under  polar 
projection)  did  not  produce  a  perceptible  contour  compared  to  rotation  in 
depth.  This  is  interesting  in  light  of  the  fact  that  human  observers  cannot 
recover  the  spatial  structure  of  a  looming  object,  but  they  can  recover  the 
spatial  structure  of  a  rotating  object  (the  well-known  kinetic  depth  effect). 
The  geometry  of  the  looming  transformation  was  analyzed  as  follows.  The
motion  in  a  looming  image  can  be  geometrically  decomposed  into  two  motion
components:  a  similarity  transformation  or  size  change,  and  a  perspective
change.  It  was  suggested  (see  also  Klymenko  and  Weisstein,  1987a)  that  human
inability  to  recover  structure  from  looming  was  due  to  two  factors.  One,  the
image  motion  due  to  the  size  change  (which  contains  no  three-dimensional
information)  is  large  compared  to  the  image  motion  due  to  the  perspective
change  (which  contains  the  three-dimensional  information).  Two,  the
trajectories  of  the  size  change  and  the  perspective  change  motion  components 
are  coincident  and  difficult  for  human  observers  to  attend  to  separately. 
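
The decomposition described above can be illustrated numerically. The following is a minimal sketch, not the analysis used in the cited work: it assumes a focal length of 1, an approach straight along the line of sight, and the mean depth of the object as the reference depth for the size change. The function names `project` and `decompose_looming` and the sample points are hypothetical.

```python
import math

def project(point, focal=1.0):
    """Polar (perspective) projection of a 3-D point (X, Y, Z) onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

def decompose_looming(points, approach, focal=1.0):
    """Split the image motion produced by moving `points` toward the eye by
    `approach` into a uniform size change and a residual perspective change.

    Assumption: the size change is the magnification of a point at the mean depth.
    Returns one (size_part, perspective_part) pair of 2-D vectors per point.
    """
    z_ref = sum(p[2] for p in points) / len(points)
    scale = z_ref / (z_ref - approach)  # uniform magnification at the reference depth
    parts = []
    for p in points:
        before = project(p, focal)
        after = project((p[0], p[1], p[2] - approach), focal)
        # Size change: every image point is scaled about the image center.
        size_part = ((scale - 1.0) * before[0], (scale - 1.0) * before[1])
        # Perspective change: whatever image motion the uniform scaling leaves over.
        persp_part = (after[0] - before[0] - size_part[0],
                      after[1] - before[1] - size_part[1])
        parts.append((size_part, persp_part))
    return parts

# A shallow object (depth 9.5 to 10.5) approaching the eye by one unit.
sample_points = [(1.0, 0.5, 9.5), (-1.0, 0.5, 10.5),
                 (1.0, -0.5, 10.5), (-1.0, -0.5, 9.5)]

for size_part, persp_part in decompose_looming(sample_points, 1.0):
    print(math.hypot(*size_part), math.hypot(*persp_part))
```

Under these assumptions both components point along the same radial line through the image center, and for a shallow object the size change component is many times larger than the perspective component, which is consistent with the two factors listed above.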

Thus  the  large  size  change  motion  component  may  "mask"  the  three-dimensional
information  in  the  perspective  motion  component.  In  fact,
observers  did  not  pick  up  three-dimensional  structure  from  perspective 
changes.  Klymenko,  Weisstein,  and  Ralston  (1987)  also  found  that  pure
perspective  changes  (with  the  size  change  nulled)  did  not  produce  a  contour,
nor  were  they  a  good  stimulus  for  recovering  three-dimensional  structure. 
Klymenko,  Weisstein,  and  Ralston  (1987)  also  found  that  size  changes  imposed
on  a  figure  rotating  in  depth  did  not  interfere  with  contour  perceptibility
or  ability  to  recover  structure  from  motion,  thus  indicating  the  robustness 
of  the  rotation  in  depth  transformation.  It  is  interesting  to  note  that  the 
set  of  experiments  reported  in  Klymenko,  Weisstein,  and  Ralston  (1987) 
indicates  a  parallel  between  contour  perceptibility  and  structure  from 
motion. 

1.23  Models  of  structure  from  motion 

Klymenko  and  Weisstein  (1987a)  introduced  a  new  model  of  structure 
from  motion  or  the  kinetic  depth  effect,  which  they  dubbed  the  resonance 
theory  of  kinetic  shape  perception.  This  three-parameter  model  accounts  for
more  of  the  structure  from  motion  data  than  any  previous  model,  as  well  as
the  motion-induced  contour  data  briefly  summarized  above.  In  addition,  the
model  accounts  for  other  perceptual  phenomena,  such  as  the  pausing  effect,
the  Ames  trapezoidal  illusion,  the  rubber  pencil  illusion,  and  the
attenuation  of  rigidity  of  rotating  objects  by  perspective.  In  addition, 
Klymenko  and  Weisstein  (1987a)  clarify  some  philosophical  problems  concerning 
illusions  and  discuss  ecological  optics. 